Compression of Deep Convolutional Neural Network Using Additional Importance-Weight-Based Filter Pruning Approach

Authors

Abstract

The success of the convolutional neural network (CNN) has come with a tremendous growth of diverse CNN structures, making them hard to deploy on limited-resource platforms. These over-sized models contain a large number of filters in the convolutional layers, which are responsible for almost 99% of the computation. The key question that arises here is: do we really need all those filters? By removing entire filters, the computational cost can be significantly reduced. Hence, in this article, a filter pruning method, a process of discarding a subset of unimportant or weak filters from the original model, is proposed, which alleviates the shortcomings of over-sized CNN architectures in terms of storage space and time. The proposed strategy compresses the model by assigning additional importance weights to the filters; these weights help each filter learn its responsibility and contribute more efficiently. We use different initialization strategies to learn about different aspects of the filters and prune accordingly. Furthermore, unlike existing approaches, the proposed method uses a predefined error tolerance level instead of a pruning rate. Extensive experiments on two widely used image segmentation datasets, Inria and AIRS, and two well-known segmentation models, TernausNet and the standard U-Net, verify that our approach compresses the models efficiently with negligible or no loss of accuracy. For instance, it could reduce 85% of the floating point operations (FLOPs) of TernausNet with a drop of only 0.32% in validation accuracy. This compressed network is six times smaller and almost seven times faster (on a cluster of GPUs) than TernausNet, while the loss in accuracy is less than 1%. Moreover, our approach reduced FLOPs by 84.34% without deteriorating the output performance on the AIRS dataset for TernausNet. The proposed method effectively reduces the number of FLOPs and parameters while retaining accuracy, so the compact model can be deployed on any embedded device without specialized hardware. We show that the output of the pruned model is very similar to that of the unpruned model. We also report numerous ablation studies to validate our approach.
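The paper's own code is not reproduced here, but the two ideas highlighted in the abstract, a learnable importance weight attached to each convolutional filter and a pruning loop driven by a predefined error tolerance rather than a fixed pruning rate, can be sketched roughly in PyTorch as follows. The names FilterImportance, prune_by_tolerance, val_fn, and tolerance are illustrative assumptions, not the authors' identifiers.

# Rough PyTorch sketch of importance-weight-based filter pruning with an
# error-tolerance stopping criterion; illustrative only, not the paper's code.
import torch
import torch.nn as nn

class FilterImportance(nn.Module):
    """Wraps a conv layer and scales each output filter by a learnable importance weight."""
    def __init__(self, conv: nn.Conv2d):
        super().__init__()
        self.conv = conv
        # one importance scalar per output filter, initialized to 1
        self.importance = nn.Parameter(torch.ones(conv.out_channels))

    def forward(self, x):
        out = self.conv(x)
        # broadcast the per-filter importance over the spatial dimensions
        return out * self.importance.view(1, -1, 1, 1)

def prune_by_tolerance(layer: FilterImportance, val_fn, tolerance: float):
    """Zero out the weakest filters while the validation loss rises by less than tolerance."""
    base_loss = val_fn()
    order = torch.argsort(layer.importance.abs())       # weakest filters first
    keep = torch.ones_like(layer.importance, dtype=torch.bool)
    for idx in order:
        with torch.no_grad():
            saved = layer.importance[idx].clone()
            layer.importance[idx] = 0.0                  # simulate removing this filter
        if val_fn() - base_loss > tolerance:             # error tolerance exceeded
            with torch.no_grad():
                layer.importance[idx] = saved            # restore the filter and stop
            break
        keep[idx] = False                                # filter can be discarded
    return keep  # boolean mask of filters to retain when rebuilding a compact model

A compact model would then be rebuilt by copying only the convolution weights selected by the returned mask, which is what reduces FLOPs and parameters at inference time.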


Similar Articles

Automated Pruning for Deep Neural Network Compression

In this work we present a method to improve the pruning step of the current state-of-the-art methodology to compress neural networks. The novelty of the proposed pruning technique lies in its differentiability, which allows pruning to be performed during the backpropagation phase of network training. This enables end-to-end learning and strongly reduces the training time. The technique is ...
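As a rough illustration of the general idea of differentiable pruning (a sigmoid gate per filter learned jointly with the weights, pushed toward zero by a sparsity penalty), a minimal sketch might look like the following; it is an assumption about the concept, not the cited paper's implementation, and SoftGatedConv and gate_sparsity_penalty are made-up names.

# Conceptual sketch of differentiable pruning: a soft gate per output filter,
# trained by backpropagation together with the network weights.
import torch
import torch.nn as nn

class SoftGatedConv(nn.Module):
    def __init__(self, conv: nn.Conv2d):
        super().__init__()
        self.conv = conv
        # one learnable gate score per output filter
        self.scores = nn.Parameter(torch.zeros(conv.out_channels))

    def forward(self, x):
        gates = torch.sigmoid(self.scores)          # differentiable values in (0, 1)
        return self.conv(x) * gates.view(1, -1, 1, 1)

def gate_sparsity_penalty(model, weight=1e-3):
    """Added to the task loss so that training itself drives some gates toward zero."""
    return weight * sum(torch.sigmoid(m.scores).sum()
                        for m in model.modules() if isinstance(m, SoftGatedConv))

Gates that end up near zero after training indicate filters that can be removed.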


Structural Compression of Convolutional Neural Networks Based on Greedy Filter Pruning

Convolutional neural networks (CNNs) have state-of-the-art performance on many problems in machine vision. However, networks with superior performance often have millions of weights so that it is difficult or impossible to use CNNs on computationally limited devices or to humanly interpret them. A myriad of CNN compression approaches have been proposed and they involve pruning and compressing t...


EMG-based wrist gesture recognition using a convolutional neural network

Background: Deep learning has revolutionized artificial intelligence and has transformed many fields. It allows processing high-dimensional data (such as signals or images) without the need for feature engineering. The aim of this research is to develop a deep learning-based system to decode motor intent from electromyogram (EMG) signals. Methods: A myoelectric system based on convolutional ne...


A Deep Neural Network Compression Pipeline: Pruning, Quantization, Huffman Encoding

Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems with limited hardware resources. To address this limitation, we introduce a three-stage pipeline: pruning, quantization, and Huffman encoding, which work together to reduce the storage requirement of neural networks by 35× to 49× without affecting their accuracy. Our method...
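A toy version of such a three-stage pipeline (magnitude pruning, uniform weight quantization, Huffman coding of the quantized indices) is sketched below as an assumption about the general technique; the cited work additionally uses sparse storage and learned centroids, which are omitted here, and all function names are illustrative.

# Toy sketch of a prune -> quantize -> Huffman-encode pipeline for a weight array.
import heapq
from collections import Counter
import numpy as np

def prune(weights, threshold):
    """Magnitude pruning: zero out weights smaller than the threshold."""
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def quantize(weights, n_levels=32):
    """Map each weight to the index of the nearest of n_levels uniform centroids."""
    centroids = np.linspace(weights.min(), weights.max(), n_levels)
    idx = np.abs(weights[..., None] - centroids).argmin(axis=-1)
    return idx.astype(np.uint8), centroids

def huffman_code(symbols):
    """Build a Huffman code book: frequent symbols receive shorter bit strings."""
    heap = [[freq, [sym, ""]] for sym, freq in Counter(symbols).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo, hi = heapq.heappop(heap), heapq.heappop(heap)
        for pair in lo[1:]:
            pair[1] = "0" + pair[1]
        for pair in hi[1:]:
            pair[1] = "1" + pair[1]
        heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
    return dict(heap[0][1:])

w = np.random.randn(10000).astype(np.float32)            # stand-in for a weight tensor
idx, centroids = quantize(prune(w, 0.5))
book = huffman_code(idx.tolist())
compressed_bits = sum(len(book[s]) for s in idx.tolist())
print(compressed_bits, "bits vs", w.size * 32, "bits uncompressed")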


Deep Columnar Convolutional Neural Network

Recent developments in the field of deep learning have shown that convolutional networks with several layers can approach human-level accuracy in tasks such as handwritten digit classification and object recognition. It is observed that the state-of-the-art performance is obtained from model ensembles, where several models are trained on the same data and their prediction probabilities are ave...



Journal

Journal title: Applied Sciences

Year: 2022

ISSN: 2076-3417

DOI: https://doi.org/10.3390/app122111184